fdaPDE (Palummo et al.,
2025) is a C++ library with an interface to R for
physics-informed spatial and functional data analysis, at the
intersection of statistics and numerical analysis. The library provides
advanced statistical methods designed for data located over complex
spatial domains, ranging from irregular planar regions and curved
surfaces to linear networks and volumes, possibly evolving over time.
The class of methods implemented in fdaPDE features
regularization terms based on Partial Differential Equations (PDEs),
which allow incorporating information derived from the physics of the
problem under study into the statistical modeling. This makes
fdaPDE an extremely flexible tool for the analysis of
complex data. For a review of this class of methods, refer to Sangalli (2021).
fdaPDE offers a wide range of modeling capabilities –
including regression, nonparametric density estimation, functional data
analysis, and more – for data located over a spatial domain, possibly
evolving over time. Among the broad range of modeling capabilities
offered by fdaPDE, we focus here on the generalized
spatial regression method.
Let \(\mathcal{D} \subset \mathbb{R}^d\,,\) with \(d \geq 1\,,\) be a bounded spatial domain of interest in which \(n\) fixed measurement stations are located. Here, we consider \(d = 2\,,\) but the proposed method can handle multidimensional spatial domains with complex shapes, including two-dimensional curved surfaces (Ettinger et al., 2016) and non-convex three-dimensional volumes (Arnone et al., 2023). At the location \(\mathbf{p}_i = (p_{1i}, p_{2i})^\top \in \mathcal{D}\) of each station, we observe a realization \(y_i \in \mathbb{R}\) of the response variable \(Y_i\) under study, along with a set of covariates \(\mathbf{x}_i = \left( x_{i,1},\ldots,x_{i,q} \right)^\top \in \mathbb{R}^q\,,\) if available. Specifically, we assume that \(Y_i\) follows a distribution from the exponential family, with mean \(\mu_i\) and a common scale parameter \(\phi\,,\) for \(i = 1, \ldots, n\,.\) In the context of spatial regression, we present a generalized additive model with Partial Differential Equation (PDE) regularization, which we formulate according to Wilhelm & Sangalli (2016), as follows: \[g(\mu_i) = \theta_i = \mathbf{x}_i^\top \boldsymbol{\beta} + f(\mathbf{p}_i) \,, \qquad i = 1, \ldots, n,\] where:
\(g\) is a continuously differentiable and strictly monotone canonical link function, whereas \(\theta_i = \theta_i(\boldsymbol{\beta}, f)\) is referred to as the \(i\)th canonical parameter;
\(\boldsymbol{\beta} \in \mathbb{R}^q\) is the vector of unknown regression coefficients capturing the mean effects of the covariates;
\(f(\mathbf{p}_i) \in \mathbb{R}\) is the unknown deterministic spatial field \(f : \mathcal{D} \to \mathbb{R}\) at the \(i\)th data location.
To estimate the unknown parameters, we refer to the basic form of the regularized methods, as reviewed in Sangalli (2021) and originally introduced in Sangalli et al. (2013). These works propose to maximize the following penalized functional based on the log-likelihood \(\ell(\cdot ; \boldsymbol{\theta})\) for the considered distribution: \[\sum_{i = 1}^n \ell\left( y_i; \theta_i \right) - R(f; \lambda)\,,\] where \(R(f; \lambda)\) is the regularization term that depends on the smoothing parameter \(\lambda \in \mathbb{R}^+\,.\) The regularization term is defined using PDEs, with differential operators encoding the smoothness of \(f\) over \(\mathcal{D}\,.\) This term may also incorporate problem-specific information into the modeling framework, whenever available. The functional above embodies the trade-off between data fidelity and model fidelity through the least-squares term and the misfit with respect to the PDE, respectively. The balance between these two contributions is governed by the smoothing parameter. When the considered distribution is Gaussian, the optimization of the functional above coincides with the one that follows from the basic form of spatial regression with PDE regularization.
In its basic form, the regularization term is given by: \[R(f; \lambda) = \lambda \int_\mathcal{D} \left( \Delta f (\mathbf{p}) \right)^2 \, \text{d}\mathbf{p}\,,\] where the Laplacian operator is defined as \[\Delta f = \frac{\partial^2 f}{\partial p_1^2} + \frac{\partial^2 f}{\partial p_2^2}\] and is a measure of the local curvature of the field \(f\,.\) More complex roughness penalties may be considered in the regularization term, with the PDE modeling the spatial variation of the unknown spatial field \(f\,,\) thereby accounting for spatial non-stationarities and anisotropies, as discussed in Wilhelm & Sangalli (2016).
The method for Generalized Spatial Regression with Partial
Differential Equation regularization presented in this vignette,
hereafter referred to as GSR-PDE, is implemented within the
newest version of the fdaPDE C++ library (Palummo et al., 2025). We load the library in
the working environment as follows:
As a benchmark application in environmental sciences, we consider fire count data recorded in Southern Italy, during 2021. In addition to fire counts per administrative division (province), the dataset also includes several areal covariates, such as area, elevation, average temperature, solar radiation, total precipitation, population, land use, and land cover, as of 2021. The dataset is freely and publicly available by different data sources, including the Fire Information for Resource Management System (FIRMS) of the National Aeronautics and Space Administration (NASA), the CORINE Land Cover (CLC) from the European Environment Agency (EEA), the Digital Elevation Model (DEM) from the National Institute of Geophysics and Volcanology (INGV), and the Copernicus ERA5 from the European Centre for Medium-Range Weather Forecasts (ECMWF) as part of the Copernicus Climate Change Service (C3S). Due to ongoing climate change and global warming, large-scale wildfires are occurring with increasing frequency worldwide, especially in southern Europe. This results in severe repercussions and damage to ecosystems, infrastructures, material assets, and socio-economic activities. Therefore, an in-depth statistical analysis of this phenomenon is becoming crucial for driving the implementation of new policies aimed at preserving both artificial and natural environments.
We import the preprocessed data observed in Southern Italy during
2021 from the shapefile
../data/GSRPDE_2D/GSRPDE_2D_data.shp as a sf
(Pebesma, 2025) object.
## [DATA]
# Load the data
data_sf <- st_read(dsn = "../data/GSRPDE_2D/GSRPDE_2D_data.shp", quiet = TRUE)
head(data_sf)
#> Simple feature collection with 6 features and 17 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 12.89095 ymin: 37.09486 xmax: 18.09681 ymax: 41.48737
#> Geodetic CRS: WGS 84
#> COUNTRY REGION PROVINCE TERRITORY AREA MIN_ELEV AVG_ELEV
#> 1 Italy Sicilia Agrigento Island 3025.283 0.5303 310.0765
#> 2 Italy Campania Avellino South 2803.711 92.7902 610.1539
#> 3 Italy Puglia Bari South 3797.713 1.0050 302.0524
#> 4 Italy Puglia Barletta-Andria-Trani South 1524.347 0.4824 217.2597
#> 5 Italy Campania Benevento South 2045.429 33.2261 479.9234
#> 6 Italy Puglia Brindisi South 1822.502 0.4948 109.3009
#> MAX_ELEV ARTIFIC AGRICUL FOREST WETLANDS WATER FIRE_COUNT POP REF_AREA
#> 1 1414.2872 0.0360 0.7901 0.1641 0.0002 0.0096 306 413177 3052.82
#> 2 1695.8716 0.0445 0.6325 0.3213 0.0002 0.0015 16 398932 2805.96
#> 3 659.2282 0.0633 0.8279 0.1074 0.0000 0.0014 75 1225048 3862.66
#> 4 657.4872 0.0448 0.8197 0.1005 0.0303 0.0047 26 379509 1542.99
#> 5 1719.1609 0.0371 0.7076 0.2527 0.0000 0.0026 17 263125 2080.37
#> 6 394.1095 0.0658 0.9121 0.0153 0.0018 0.0050 24 379522 1861.33
#> POP_DENS geometry
#> 1 135.3427 MULTIPOLYGON (((12.89625 37...
#> 2 142.1731 MULTIPOLYGON (((14.60641 40...
#> 3 317.1514 MULTIPOLYGON (((17.35441 40...
#> 4 245.9569 MULTIPOLYGON (((16.20329 40...
#> 5 126.4799 MULTIPOLYGON (((14.5728 41....
#> 6 203.8983 MULTIPOLYGON (((18.09681 40...
# Number of data (i.e., of provinces)
n <- nrow(data_sf)Before discussing the data in detail, we visualize the boundaries of
each province within the spatial domain of interest in Southern Italy.
These boundaries are already embedded in the sf object
loaded above. To enable interactive visualization, we use the
mapview (Appelhans et al.,
2023) R package.
## [SPATIAL DOMAIN]
# Provinces
provinces_sfc <- st_geometry(obj = data_sf)
provinces_sf <- st_as_sf(x = provinces_sfc)
provinces_sf
#> Simple feature collection with 33 features and 0 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 12.44208 ymin: 36.64986 xmax: 18.52069 ymax: 42.89375
#> Geodetic CRS: WGS 84
#> First 10 features:
#> x
#> 1 MULTIPOLYGON (((12.89625 37...
#> 2 MULTIPOLYGON (((14.60641 40...
#> 3 MULTIPOLYGON (((17.35441 40...
#> 4 MULTIPOLYGON (((16.20329 40...
#> 5 MULTIPOLYGON (((14.5728 41....
#> 6 MULTIPOLYGON (((18.09681 40...
#> 7 MULTIPOLYGON (((14.44479 37...
#> 8 MULTIPOLYGON (((14.50486 41...
#> 9 MULTIPOLYGON (((14.03236 40...
#> 10 MULTIPOLYGON (((14.79454 37...
# Boundaries of each province
boundary_sf <- st_boundary(x = provinces_sf)
boundary_sf
#> Simple feature collection with 33 features and 0 fields
#> Geometry type: MULTILINESTRING
#> Dimension: XY
#> Bounding box: xmin: 12.44208 ymin: 36.64986 xmax: 18.52069 ymax: 42.89375
#> Geodetic CRS: WGS 84
#> First 10 features:
#> x
#> 1 MULTILINESTRING ((12.89625 ...
#> 2 MULTILINESTRING ((14.60641 ...
#> 3 MULTILINESTRING ((17.35441 ...
#> 4 MULTILINESTRING ((16.20329 ...
#> 5 MULTILINESTRING ((14.5728 4...
#> 6 MULTILINESTRING ((18.09681 ...
#> 7 MULTILINESTRING ((14.44479 ...
#> 8 MULTILINESTRING ((14.50486 ...
#> 9 MULTILINESTRING ((14.03236 ...
#> 10 MULTILINESTRING ((14.79454 ...
# Interactive plot
mapview(provinces_sf, col.regions = "gray75", lwd = 1.5,
legend = FALSE, layer.name = "provinces")Figure 1: Boundaries of each of the 33 provinces in Southern Italy.
Now we build a regular mesh of the spatial domain
above. The mesh provides a discretization of the spatial domain, which
is required to solve the estimation problem using the finite element
method. To do so, we directly load mesh nodes and triangles from
../data/GSRPDE/GSRPDE_mesh_nodes.txt and
../data/GSRPDE/GSRPDE_mesh_triangles.txt files. These files
store a matrix of size #nodes-by-2 containing the \(x\) and \(y\) coordinates of the mesh nodes, and a
matrix of size #triangles-by-3, where the \(i\)th row contains the indices of the mesh
nodes forming the \(i\)th triangle.
These matrices are obtained using the femR (Clemente et al., 2025) and
RTriangle (Shewchuk & Sterratt,
2025) R packages; further details are omitted here. Other
software can be used to get the triangulation; for example, it can be
constructed through the fmesher (Lindgren, 2025) R package.
## [MESH]
# Load the mesh nodes
mesh_nodes <- read.table(file = "../data/GSRPDE_2D/GSRPDE_2D_mesh_nodes.txt")
head(mesh_nodes)
#> V1 V2
#> 1 17.02403 39.48284
#> 2 16.97541 39.49125
#> 3 16.86791 39.53792
#> 4 16.83847 39.55597
#> 5 16.81708 39.58875
#> 6 16.78153 39.61458
# Load the nodes markers
mesh_nodesmarkers <- read.table(file = "../data/GSRPDE_2D/GSRPDE_2D_mesh_nodesmarkers.txt")
head(mesh_nodesmarkers)
#> V1
#> 1 1
#> 2 1
#> 3 1
#> 4 1
#> 5 1
#> 6 1
# Load the mesh triangles
mesh_triangles <- read.table(file = "../data/GSRPDE_2D/GSRPDE_2D_mesh_triangles.txt")
head(mesh_triangles)
#> V1 V2 V3
#> 1 934 283 949
#> 2 424 1068 1582
#> 3 283 284 949
#> 4 539 1296 1095
#> 5 273 271 272
#> 6 79 80 1082
# Set up the triangulation for fdaPDE
mesh <- triangulation(cells = mesh_triangles, nodes = mesh_nodes,
boundary = mesh_nodesmarkers)
# Nodes coordinates
head(mesh$nodes)
#> [,1] [,2]
#> [1,] 17.02403 39.48284
#> [2,] 16.97541 39.49125
#> [3,] 16.86791 39.53792
#> [4,] 16.83847 39.55597
#> [5,] 16.81708 39.58875
#> [6,] 16.78153 39.61458
# Number of nodes
mesh$n_nodes
#> [1] 2563
# Edges
head(mesh$edges)
#> [,1] [,2]
#> [1,] 283 934
#> [2,] 934 949
#> [3,] 283 949
#> [4,] 424 1068
#> [5,] 424 1582
#> [6,] 1068 1582
# Number of edges
mesh$n_edges
#> [1] 7075
# Triangles
head(mesh$cells)
#> [,1] [,2] [,3]
#> [1,] 934 283 949
#> [2,] 424 1068 1582
#> [3,] 283 284 949
#> [4,] 539 1296 1095
#> [5,] 273 271 272
#> [6,] 79 80 1082
# Number of triangles
mesh$n_cells
#> [1] 4514
# Bounding Box
mesh$bbox
#> [,1] [,2]
#> [1,] 12.44208 36.64986
#> [2,] 18.52069 42.89375We visualize the resulting mesh of the spatial domain of interest.
# Interactive plot
mapview(boundary_sf,
col.regions = "gray25", alpha.regions = 0.25, col = "black", lwd = 1.5,
legend = FALSE, layer.name = "domain") +
mapview(mesh, crs = 4326, col.regions = "transparent", lwd = 1.25,
legend = FALSE, layer.name = "mesh")Figure 2: Regular mesh of the spatial domain of interest of Southern Italy with \(2\,563\) nodes and \(4\,514\) triangles. Note that the mesh conforms to the boundaries of the provinces.
We use the triangulation just created to define the spatial support
of a geoframe object. This object will host layers
containing data observed over the spatial support.
We briefly introduce the data under study. We add a data layer to the
geoframe object defined above, thus obtaining a format
compatible with the implementation of the proposed regression
method.
## [DATA]
# Add layer with data to the geoframe object (directly from the shapefile)
italy$load_shp(layer = "fires", filename = "../data/GSRPDE_2D/GSRPDE_2D_data.shp")
italy
#> Geoframe with 1 layers
#> Bounding box: xmin: 12.44208 ymin: 36.64986 xmax: 18.52069 ymax: 42.89375
#> Number of nodes: 2563
#> Number of cells: 4514
#>
#> Layer: fires
#> Type: AREAL
#> Dims: 33, 17
#> First 6 data rows:
#> COUNTRY REGION PROVINCE TERRITORY AREA MIN_ELEV AVG_ELEV MAX_ELEV ARTIFIC AGRICUL FOREST WETLANDS WATER FIRE_COUNT POP REF_AREA POP_DENS
#> [0;31m<chr> [0m[0;31m<chr> [0m[0;31m<chr> [0m[0;31m<chr> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m[0;31m<flt64> [0m
#> Italy Sicilia Agrigento Island 3025.28 0.5303 310.076 1414.29 0.036 0.7901 0.1641 2e-04 0.0096 306 413177 3052.82 135.343
#> Italy Campania Avellino South 2803.71 92.7902 610.154 1695.87 0.0445 0.6325 0.3213 2e-04 0.0015 16 398932 2805.96 142.173
#> Italy Puglia Bari South 3797.71 1.005 302.052 659.228 0.0633 0.8279 0.1074 0 0.0014 75 1225048 3862.66 317.151
#> Italy Puglia Barletta-Andria-Trani South 1524.35 0.4824 217.26 657.487 0.0448 0.8197 0.1005 0.0303 0.0047 26 379509 1542.99 245.957
#> Italy Campania Benevento South 2045.43 33.2261 479.923 1719.16 0.0371 0.7076 0.2527 0 0.0026 17 263125 2080.37 126.48
#> Italy Puglia Brindisi South 1822.5 0.4948 109.301 394.11 0.0658 0.9121 0.0153 0.0018 0.005 24 379522 1861.33 203.898
# Variable names
names(italy[["fires"]])
#> [1] "COUNTRY" "REGION" "PROVINCE" "TERRITORY" "AREA"
#> [6] "MIN_ELEV" "AVG_ELEV" "MAX_ELEV" "ARTIFIC" "AGRICUL"
#> [11] "FOREST" "WETLANDS" "WATER" "FIRE_COUNT" "POP"
#> [16] "REF_AREA" "POP_DENS"
# Number of variables
ncol(italy[["fires"]])
#> [1] 17
# First province polygon nodes
head(gf_polygons(italy[["fires"]])[[1]]$nodes)
#> [,1] [,2]
#> [1,] 13.75042 37.14903
#> [2,] 13.82347 37.14458
#> [3,] 13.39980 37.55757
#> [4,] 13.35955 37.54314
#> [5,] 13.32142 37.58921
#> [6,] 13.56442 37.63658
# First province polygon edges
head(gf_polygons(italy[["fires"]])[[1]]$nodes)
#> [,1] [,2]
#> [1,] 13.75042 37.14903
#> [2,] 13.82347 37.14458
#> [3,] 13.39980 37.55757
#> [4,] 13.35955 37.54314
#> [5,] 13.32142 37.58921
#> [6,] 13.56442 37.63658The variables observed at the province level are given by:
“COUNTRY”, “REGION”, “PROVINCE”: names of the country, region, and province;
“TERRITORY”: type of territory (“South”, “Island”);
“AREA”: area (in \(\text{km}^2\));
“MIN_ELEV”, “AVG_ELEV”, “MAX_ELEV”: minimum, average, and maximum elevation (in \(\text{m}\));
“ARTIFIC”, “AGRICUL”, “FOREST”, “WETLANDS”, “WATER”: percentages of artificial areas, agricultural areas, forest and seminatural areas, wetlands, and water bodies;
“FIRE_COUNT”: number of fires recorded in 2021;
“POP”, “REF_AREA”, “POP_DENS”: population, reference area, and population density.
In addition, “geometry” contains the multipolygons defining the provinces under study.
The values of most of these variables, recorded by province in 2021,
are shown in the interactive plot below, created using
mapview.
# Color palettes
color_palette_fire_counts <- c("#FFFFB2", "#FECC5C", "#FD8D3C", "#F03B20", "#BD0026")
color_palette_population <- c("#DEEBF7", "#9ECAE1", "#6BAED6", "#3182BD", "#08519C")
color_palette_forest <- c("#E5F5E0", "#A1D99B", "#74C476", "#31A354", "#006D2C")
color_palette_elevation <- c("#006400", "#A2CD5A", "#F5DEB3", "#8B864E", "#8B2323")
# Interactive plot
mapview(italy[["fires"]], crs = 4326, varnames = c("FIRE_COUNT", "POP", "FOREST", "AVG_ELEV"),
color_palettes = list(color_palette_fire_counts,
color_palette_population,
color_palette_forest,
color_palette_elevation))Figure 3: Fire counts observed at 33 provinces in Southern Italy during 2021 (top left); population measured at the province level as of 2021 (top right); forest and seminatural areas (in \(\text{km}^2\)) measured at the province level as of 2021 (bottom left); average elevation (in \(\text{m}\)) measured at the province level as of 2021 (bottom right).
First, we focus on the FIRE_COUNT variable. The non-standard
shape of the spatial domain strongly influences the quantity under
study. For example, fire counts recorded in the two provinces facing the
Strait of Messina, namely the provinces of Messina in Sicily and
Reggio-Calabria in Calabria, are not highly correlated. This spatial
variation suggests that fire counts depends on factors beyond purely
environmental ones (e.g., temperature, elevation, land cover), which
typically vary smoothly across space. Specifically, differences in fire
counts may be partly due to the varying number of arson incidents
reported in 2021 in the two provinces, highlighting the role of
unobserved cultural and social phenomena, which may not be fully
captured by methods that rely solely on Euclidean distances.
Differently, GSR-PDE naturally accounts for the geometry of
the domain, which may feature disjoint regions with irregular boundaries
and concavities, as in the case under consideration.
We are ready to perform the GSR-PDE spatial regression
method using the gsr function. This function offers several
options for solving the regression problem, including the possibility to
specify covariates, PDE parameters, and boundary conditions, whenever
available. The gsr function supports several distributions
within the exponential family for generalized linear models and can
handle areal data, as in the case under study. In addition, the function
implements different algorithms for degrees of freedom computation and
criteria for optimal smoothing parameter selection. For further
information, see the documentation by typing ?gsr.
Here, in its simplest form, we apply a penalized Poisson regression
without covariates and with isotropic smoothing to the areal fire count
data, which corresponds to incorporating the Laplace operator in the
regularization term. Specifically, we assume that each response variable
\(Y_i\) is Poisson-distributed with
mean \(\mu_i\,,\) for \(i = 1,\ldots,n\,,\) with \(n = 33\) being the number of provinces in
the dataset. We consider the following model: \[\log(\mu_i) = \theta_i = f(\mathbf{p}_i) \,,
\qquad i = 1, \ldots, n\,.\] This can be achieved by specifying
the option family = poisson as a parameter of the
gsr function. We select the optimal value for the smoothing
parameter \(\lambda\) by minimizing the
Generalized Cross-Validation (GCV) index on a finite
grid of proposed values. To this end, we perform a stochastic
computation of the degrees of freedom.
## [ISOTROPIC SMOOTHING WITH OPTIMAL SMOOTHING PARAMETER]
# Set up the finite element function (order 1)
f_grid <- fe_function(mesh, type = "P1")
# Proposed values for the smoothing parameter
lambda_grid <- 10^seq(from = -6, to = 0, by = 0.25)
# Isotropic smoothing model
model_grid <- gsr(formula = FIRE_COUNT ~ f_grid, data = italy, family = "poisson")
# Isotropic smoothing fit with fixed lambda
fit_grid <- model_grid$fit(
calibrator = gcv(
optimizer = grid_search(lambda_grid)
)
)We print the regression model outputs.
# Fitted values at mesh nodes
head(f_grid$coeff)
#> [1] 4.860979 4.896364 5.041834 5.091835 5.132971 5.166168
# Fitted values at locations
head(model_grid$fitted)
#> [1] 5.849399 4.322543 4.284184 4.920359 4.128864 4.021816
# Residuals at locations: response - fitted values
grid_residuals <- c(italy[["fires"]]$FIRE_COUNT - model_grid$fitted)
summary(grid_residuals)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 4.228 21.080 73.160 131.982 197.474 501.911Moreover, it is possible to inspect the behavior of the GCV indices as a function of the values proposed for the smoothing parameter \(\lambda\,.\)
# Optimal value selected for the smoothing parameter
lambda_opt_grid <- fit_grid$optimum
lambda_opt_grid
#> [1] 0.03162278
# Plot of the GCV curve
par(family = "serif")
plot(x = log10(lambda_grid), y = fit_grid$values, type = 'b',
lwd = 2, xlab = TeX("$\\log_{10}(\\lambda)$"), ylab = "GCV")
grid()
abline(v = log10(lambda_opt_grid), lty = 2, lwd = 2, col = "royalblue")
legend("topleft", lty = 2, lwd = 2, col = "royalblue",
legend = TeX("$\\log_{10}(\\lambda_{opt})"))Figure 4: GCV curve.
The GCV curve is convex with minimum realized at the optimal value selected by the method – specifically, \(0.03162278\,.\)
The regression model fit over the provinces of interest is displayed below (left). For qualitative comparison, the fire count data is also displayed (right).
# Compute areal estimate by province
f_grid_areal <- eval_areal(x = f_grid, layer = italy[["fires"]], crs = 4326)
# Interactive plot
map1 <- mapview(f_grid_areal, col.regions = color_palette_fire_counts,
na.color = "transparent", layer.name = "ESTIMATE") +
mapview(boundary_sf,
col.regions = "transparent", alpha.regions = 0.25, col = "black", lwd = 1.5,
legend = FALSE, layer.name = "domain")
map2 <- mapview(italy[["fires"]], crs = 4326, varnames = "FIRE_COUNT",
color_palettes = list(color_palette_fire_counts)) +
mapview(boundary_sf,
col.regions = "transparent", alpha.regions = 0.25, col = "black", lwd = 1.5,
legend = FALSE, layer.name = "domain")
sync(map1, map2)Figure 5: Estimated spatial field of the fire counts provided by the
isotropic GSR-PDE for each province without covariates and
with \(\lambda\) selected via GCV
minimization using grid search method (left); fire counts observed at 33
provinces in Southern Italy during 2021 (right). We recall that the
latter are not used for estimation purposes; they are displayed here
solely to allow comparison with the fire counts estimate computed by the
proposed GSR-PDE method based on the observations loaded
above.
The smoothing fit already appears very accurate.
We show the pointwise spatial prediction. To evaluate the regression
model fit over a fine grid and enable interactive visualization of the
estimate, we internally create a new raster object. Then we
compute the fitted values over the grid using the $eval
method. We plot the resulting estimate with mapview: this
is just one possibility; various other plotting options are available
depending on the specific purpose.
The regression model fit over the fine grid is displayed below (left). In addition, we load and show the point pattern associated with the locations of fires occurred in 2021 (right).
# Load locations of fires
locations <- st_read(dsn = "../data/GSRPDE_2D/GSRPDE_2D_locations.shx", quiet = TRUE)
# Interactive plot
map1 <- mapview(f_grid, crs = 4326, col.regions = color_palette_fire_counts,
alpha.regions = 0.75, na.color = "transparent",
layer.name = "intensity") +
mapview(boundary_sf, color = "gray25", lwd = 1.5,
legend = FALSE, layer.name = "provinces")
map2 <- mapview(locations, legend = FALSE, col.regions = "black",
alpha.regions = 0.75, stroke = FALSE, cex = 2,
layer.name = "locations") +
mapview(boundary_sf, color = "gray25", lwd = 1.5,
legend = FALSE, layer.name = "provinces")
sync(map1, map2)Figure 6: Estimated spatial field of the fire counts provided by the
isotropic GSR-PDE without covariates and with \(\lambda\) selected via GCV minimization
using grid search method (left); locations of fires occurred in Southern
Italy in 2021 (right).
In particular, by focusing on the spatial point pattern rather than aggregated counts, the Poisson regression and the inhomogeneous Poisson point process share equivalent modeling structures, enabling a direct comparison of their pointwise intensity estimates. For density and intensity estimation with PDE regularization, see Ferraccioli et al. (2021) and Begu et al. (2024).
The phenomenon under study is influenced by several factors,
including the average elevation, the population, the type of territory,
as well as the extent of artificial areas, agricultural areas, forest
and seminatural areas, wetlands and water bodies, all measured at the
province level as of 2021. These covariates could be incorporated into
the GSR-PDE modeling framework to better explain fire count
data in Southern Italy.
Moreover, as a future development, the model fitting could leverage problem-specific information derived from the physics of the underlying factors influencing the phenomenon considered here. In particular, the regularization term could incorporate a PDE accounting for the presence of wind across the provinces.